National generation and fuel consumption

The data in this notebook is generation and consumption by fuel type for the entire US. These values are larger than what would be calculated by summing facility-level data. Note that the fuel types are somewhat aggregated (coal rather than BIT, SUB, LIG, etc). So when we multiply the fuel consumption by an emissions factor there will be some level of error.

The code assumes that you have already downloaded an ELEC.txt file from EIA's bulk download website.


In [1]:
import json
import pandas as pd
import os
from os.path import join
import numpy as np
import sys

cwd = os.getcwd()
data_path = join(cwd, '..', 'Data storage')
idx = pd.IndexSlice

Date string for filenames

This will be inserted into all filenames (reading and writing)


In [ ]:
file_date = '2018-03-06'

In [2]:
%load_ext watermark

In [3]:
%watermark -iv -v


json        2.0.9
pandas      0.21.1
numpy       1.13.3
CPython 3.6.3
IPython 6.2.1

In [4]:
# Load the "autoreload" extension
%load_ext autoreload

# always reload modules marked with "%aimport"
%autoreload 1

# add the 'src' directory as one where we can import modules
src_dir = join(os.getcwd(), os.pardir, 'src')
sys.path.append(src_dir)

In [5]:
%aimport Analysis.index
from Analysis.index import add_datetime, add_quarter

Read ELECT.txt file


In [55]:
cwd = os.getcwd()
path = join(data_path, 'Raw EIA bulk', '{} ELEC.txt'.format(file_date))
with open(path, 'r') as f:
    raw_txt = f.readlines()

Filter lines to only include total generation and fuel use

Only want monthly US data for all sectors

  • US-99.M
  • ELEC.GEN, ELEC.CONS_TOT_BTU, ELEC.CONS_EG_BTU
  • not ALL

Fuel codes:

  • WWW, wood and wood derived fuels
  • WND, wind
  • STH, solar thermal
  • WAS, other biomass
  • TSN, all solar
  • SUN, utility-scale solar
  • NUC, nuclear
  • NG, natural gas
  • PEL, petroleum liquids
  • SPV, utility-scale solar photovoltaic
  • PC, petroluem coke
  • OTH, other
  • COW, coal,
  • DPV, distributed photovoltaic
  • OOG, other gases
  • HPS, hydro pumped storage
  • HYC, conventional hydroelectric
  • GEO, geothermal
  • AOR, other renewables (total)

In [56]:
def line_to_df(line):
    """
    Takes in a line (dictionary), returns a dataframe
    """
    for key in ['latlon', 'source', 'copyright', 'description', 
                'geoset_id', 'iso3166', 'name', 'state']:
        line.pop(key, None)

    # Split the series_id up to extract information
    # Example: ELEC.PLANT.GEN.388-WAT-ALL.M
    series_id = line['series_id']
    series_id_list = series_id.split('.')
    # Use the second to last item in list rather than third
    plant_fuel_mover = series_id_list[-2].split('-')
    line['type'] = plant_fuel_mover[0]
#     line['state'] = plant_fuel_mover[1]
    line['sector'] = plant_fuel_mover[2]
    temp_df = pd.DataFrame(line)

    try:
        temp_df['year'] = temp_df.apply(lambda x: x['data'][0][:4], axis=1).astype(int)
        temp_df['month'] = temp_df.apply(lambda x: x['data'][0][-2:], axis=1).astype(int)
        temp_df['value'] = temp_df.apply(lambda x: x['data'][1], axis=1)
        temp_df.drop('data', axis=1, inplace=True)
        return temp_df
    except:
        exception_list.append(line)
        pass

In [57]:
states = ["AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE",
          "FL", "GA", "HI", "ID", "IL", "IN", "IA", "KS",
          "KY", "LA", "ME", "MD", "MA", "MI", "MN", "MS",
          "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY",
          "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC",
          "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY"]

In [58]:
state_geos = ['USA-{}'.format(state) for state in states]

In [59]:
type(json.loads(raw_txt[0]))


Out[59]:
dict

In [60]:
json.loads(raw_txt[0])['geography']


Out[60]:
'USA-CA'

In [61]:
gen_rows = [row for row in raw_txt if 'ELEC.GEN' in row 
            and 'series_id' in row 
            and '-99.M' in row 
            and 'ALL' not in row]

total_fuel_rows = [row for row in raw_txt if 'ELEC.CONS_TOT_BTU' in row 
                   and 'series_id' in row 
                   and '-99.M' in row 
                   and 'ALL' not in row
                   and 'US-99.m' not in row]

eg_fuel_rows = [row for row in raw_txt if 'ELEC.CONS_EG_BTU' in row 
                and 'series_id' in row 
                and '-99.M' in row 
                and 'ALL' not in row
                and 'US-99.m' not in row]

All generation by fuel


In [62]:
gen_dicts = [json.loads(row) for row in gen_rows]

In [63]:
gen_df = pd.concat([line_to_df(x) for x in gen_dicts
                    if x['geography'] in state_geos])

In [64]:
#drop
gen_df.head()


Out[64]:
end f geography last_updated sector series_id start type units year month value
0 201712 M USA-AK 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AK-99.M 200101 AOR thousand megawatthours 2017 12 19.39150
1 201712 M USA-AK 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AK-99.M 200101 AOR thousand megawatthours 2017 11 19.76574
2 201712 M USA-AK 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AK-99.M 200101 AOR thousand megawatthours 2017 10 20.52333
3 201712 M USA-AK 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AK-99.M 200101 AOR thousand megawatthours 2017 9 13.56737
4 201712 M USA-AK 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AK-99.M 200101 AOR thousand megawatthours 2017 8 11.61728

In [65]:
gen_df['geography'].unique()


Out[65]:
array(['USA-AK', 'USA-AZ', 'USA-CO', 'USA-MI', 'USA-DE', 'USA-MO',
       'USA-WA', 'USA-NY', 'USA-CA', 'USA-PA', 'USA-MN', 'USA-MS',
       'USA-MT', 'USA-WI', 'USA-MD', 'USA-OH', 'USA-ME', 'USA-KS',
       'USA-WY', 'USA-RI', 'USA-NV', 'USA-OK', 'USA-OR', 'USA-NC',
       'USA-AR', 'USA-WV', 'USA-SD', 'USA-HI', 'USA-GA', 'USA-ND',
       'USA-TN', 'USA-CT', 'USA-NE', 'USA-SC', 'USA-LA', 'USA-VA',
       'USA-IL', 'USA-AL', 'USA-IN', 'USA-MA', 'USA-IA', 'USA-KY',
       'USA-FL', 'USA-ID', 'USA-NM', 'USA-NJ', 'USA-UT', 'USA-TX',
       'USA-VT', 'USA-NH'], dtype=object)

Multiply generation values by 1000 and change the units to MWh


In [66]:
gen_df.loc[:,'value'] *= 1000
gen_df.loc[:,'units'] = 'megawatthours'

In [67]:
gen_df.rename_axis({'value':'generation (MWh)'}, axis=1, inplace=True)


C:\Users\gschivley\Anaconda2\envs\psci\lib\site-packages\ipykernel_launcher.py:1: FutureWarning: Using 'rename_axis' to alter labels is deprecated. Use '.rename' instead
  """Entry point for launching an IPython kernel.

In [68]:
gen_df.loc[gen_df.isnull().any(axis=1)]


Out[68]:
end f geography last_updated sector series_id start type units year month generation (MWh)
5 201712 M USA-AK 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AK-99.M 200101 AOR megawatthours 2017 7 NaN
10 201712 M USA-AK 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AK-99.M 200101 AOR megawatthours 2017 2 NaN
11 201712 M USA-AK 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AK-99.M 200101 AOR megawatthours 2017 1 NaN
61 201712 M USA-AK 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AK-99.M 200101 AOR megawatthours 2012 11 NaN
71 201712 M USA-AK 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AK-99.M 200101 AOR megawatthours 2012 1 NaN
12 201712 M USA-MI 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-MI-99.M 201401 DPV megawatthours 2016 12 NaN
24 201712 M USA-WY 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-WY-99.M 201401 DPV megawatthours 2015 12 NaN
25 201712 M USA-WY 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-WY-99.M 201401 DPV megawatthours 2015 11 NaN
4 201712 M USA-HI 2018-02-28T02:03:13-05:00 99 ELEC.GEN.HYC-HI-99.M 200101 HYC megawatthours 2017 8 NaN
5 201712 M USA-HI 2018-02-28T02:03:13-05:00 99 ELEC.GEN.HYC-HI-99.M 200101 HYC megawatthours 2017 7 NaN
6 201712 M USA-HI 2018-02-28T02:03:13-05:00 99 ELEC.GEN.HYC-HI-99.M 200101 HYC megawatthours 2017 6 NaN
7 201712 M USA-HI 2018-02-28T02:03:13-05:00 99 ELEC.GEN.HYC-HI-99.M 200101 HYC megawatthours 2017 5 NaN
8 201712 M USA-HI 2018-02-28T02:03:13-05:00 99 ELEC.GEN.HYC-HI-99.M 200101 HYC megawatthours 2017 4 NaN
9 201712 M USA-HI 2018-02-28T02:03:13-05:00 99 ELEC.GEN.HYC-HI-99.M 200101 HYC megawatthours 2017 3 NaN
10 201712 M USA-HI 2018-02-28T02:03:13-05:00 99 ELEC.GEN.HYC-HI-99.M 200101 HYC megawatthours 2017 2 NaN
61 201712 M USA-OR 2018-02-28T02:03:13-05:00 99 ELEC.GEN.GEO-OR-99.M 201211 GEO megawatthours 2012 11 NaN
0 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2017 12 NaN
1 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2017 11 NaN
2 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2017 10 NaN
3 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2017 9 NaN
4 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2017 8 NaN
12 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2016 12 NaN
13 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2016 11 NaN
14 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2016 10 NaN
15 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2016 9 NaN
16 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2016 8 NaN
17 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2016 7 NaN
18 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2016 6 NaN
19 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2016 5 NaN
20 201712 M USA-GA 2018-02-28T02:03:13-05:00 99 ELEC.GEN.DPV-GA-99.M 201401 DPV megawatthours 2016 4 NaN
... ... ... ... ... ... ... ... ... ... ... ... ...
30 201702 M USA-WV 2017-05-24T14:26:30-04:00 99 ELEC.GEN.WAS-WV-99.M 200101 WAS megawatthours 2014 8 NaN
31 201702 M USA-WV 2017-05-24T14:26:30-04:00 99 ELEC.GEN.WAS-WV-99.M 200101 WAS megawatthours 2014 7 NaN
32 201702 M USA-WV 2017-05-24T14:26:30-04:00 99 ELEC.GEN.WAS-WV-99.M 200101 WAS megawatthours 2014 6 NaN
33 201702 M USA-WV 2017-05-24T14:26:30-04:00 99 ELEC.GEN.WAS-WV-99.M 200101 WAS megawatthours 2014 5 NaN
34 201702 M USA-WV 2017-05-24T14:26:30-04:00 99 ELEC.GEN.WAS-WV-99.M 200101 WAS megawatthours 2014 4 NaN
35 201702 M USA-WV 2017-05-24T14:26:30-04:00 99 ELEC.GEN.WAS-WV-99.M 200101 WAS megawatthours 2014 3 NaN
36 201702 M USA-WV 2017-05-24T14:26:30-04:00 99 ELEC.GEN.WAS-WV-99.M 200101 WAS megawatthours 2014 2 NaN
37 201702 M USA-WV 2017-05-24T14:26:30-04:00 99 ELEC.GEN.WAS-WV-99.M 200101 WAS megawatthours 2014 1 NaN
0 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2014 10 NaN
1 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2014 9 NaN
2 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2014 8 NaN
3 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2014 7 NaN
4 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2014 6 NaN
5 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2014 5 NaN
6 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2014 4 NaN
7 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2014 3 NaN
8 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2014 2 NaN
9 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2014 1 NaN
10 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 12 NaN
11 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 11 NaN
12 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 10 NaN
13 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 9 NaN
14 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 8 NaN
15 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 7 NaN
16 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 6 NaN
17 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 5 NaN
18 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 4 NaN
19 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 3 NaN
20 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 2 NaN
21 201410 M USA-AK 2016-12-19T17:19:30-05:00 99 ELEC.GEN.OOG-AK-99.M 201101 OOG megawatthours 2013 1 NaN

900 rows × 12 columns


In [69]:
gen_df.dropna(inplace=True)

In [70]:
gen_df.set_index(['type', 'year', 'month', 'geography'], inplace=True)

Total fuel consumption


In [71]:
total_fuel_dict = [json.loads(row) for row in total_fuel_rows]

In [72]:
total_fuel_df = pd.concat([line_to_df(x) for x in total_fuel_dict
                           if x['geography'] in state_geos])

Multiply generation values by 1,000,000 and change the units to MMBtu


In [73]:
total_fuel_df.loc[:,'value'] *= 1E6
total_fuel_df.loc[:,'units'] = 'mmbtu'

In [74]:
total_fuel_df.rename_axis({'value':'total fuel (mmbtu)'}, axis=1, inplace=True)


C:\Users\gschivley\Anaconda2\envs\psci\lib\site-packages\ipykernel_launcher.py:1: FutureWarning: Using 'rename_axis' to alter labels is deprecated. Use '.rename' instead
  """Entry point for launching an IPython kernel.

In [75]:
total_fuel_df.set_index(['type', 'year', 'month', 'geography'], inplace=True)

Drop nans


In [76]:
total_fuel_df.loc[total_fuel_df.isnull().any(axis=1)]


Out[76]:
end f last_updated sector series_id start units total fuel (mmbtu)
type year month geography
COW 2017 6 USA-ME 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ME-99.M 200101 mmbtu NaN
12 USA-AK 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-AK-99.M 200101 mmbtu NaN
7 USA-AK 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-AK-99.M 200101 mmbtu NaN
6 USA-AK 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-AK-99.M 200101 mmbtu NaN
3 USA-AK 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-AK-99.M 200101 mmbtu NaN
12 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
11 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
10 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
9 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
8 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
7 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
6 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
5 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
4 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
3 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
2 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
1 USA-ID 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-ID-99.M 200101 mmbtu NaN
2015 1 USA-HI 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-HI-99.M 200101 mmbtu NaN
PEL 2017 12 USA-MT 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MT-99.M 200101 mmbtu NaN
11 USA-MT 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MT-99.M 200101 mmbtu NaN
8 USA-MT 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MT-99.M 200101 mmbtu NaN
2015 2 USA-KY 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-KY-99.M 200101 mmbtu NaN
2017 8 USA-KS 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-KS-99.M 200101 mmbtu NaN
12 USA-LA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-LA-99.M 200101 mmbtu NaN
11 USA-LA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-LA-99.M 200101 mmbtu NaN
10 USA-LA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-LA-99.M 200101 mmbtu NaN
9 USA-LA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-LA-99.M 200101 mmbtu NaN
8 USA-LA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-LA-99.M 200101 mmbtu NaN
7 USA-LA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-LA-99.M 200101 mmbtu NaN
6 USA-LA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-LA-99.M 200101 mmbtu NaN
... ... ... ... ... ... ... ... ... ...
9 USA-NE 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NE-99.M 200101 mmbtu NaN
8 USA-NE 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NE-99.M 200101 mmbtu NaN
7 USA-NE 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NE-99.M 200101 mmbtu NaN
6 USA-NE 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NE-99.M 200101 mmbtu NaN
5 USA-NE 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NE-99.M 200101 mmbtu NaN
4 USA-NE 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NE-99.M 200101 mmbtu NaN
3 USA-NE 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NE-99.M 200101 mmbtu NaN
2 USA-NE 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NE-99.M 200101 mmbtu NaN
1 USA-NE 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NE-99.M 200101 mmbtu NaN
6 USA-NM 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NM-99.M 200101 mmbtu NaN
11 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NJ-99.M 200101 mmbtu NaN
5 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NJ-99.M 200101 mmbtu NaN
3 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NJ-99.M 200101 mmbtu NaN
2 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NJ-99.M 200101 mmbtu NaN
2015 2 USA-NV 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-NV-99.M 200101 mmbtu NaN
2013 12 USA-PA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-PA-99.M 200101 mmbtu NaN
2 USA-PA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-PA-99.M 200101 mmbtu NaN
PC 2014 12 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PC-NJ-99.M 201101 mmbtu NaN
6 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PC-NJ-99.M 201101 mmbtu NaN
5 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PC-NJ-99.M 201101 mmbtu NaN
4 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PC-NJ-99.M 201101 mmbtu NaN
3 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PC-NJ-99.M 201101 mmbtu NaN
2 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PC-NJ-99.M 201101 mmbtu NaN
1 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PC-NJ-99.M 201101 mmbtu NaN
PEL 2017 6 USA-CO 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-CO-99.M 200101 mmbtu NaN
4 USA-CO 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-CO-99.M 200101 mmbtu NaN
1 USA-CO 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-CO-99.M 200101 mmbtu NaN
12 USA-IL 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-IL-99.M 200101 mmbtu NaN
4 USA-CT 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-CT-99.M 200101 mmbtu NaN
NG 2017 10 USA-MO 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.NG-MO-99.M 200101 mmbtu NaN

197 rows × 8 columns


In [77]:
total_fuel_df = total_fuel_df.dropna()

Electric generation fuel consumption


In [78]:
eg_fuel_dict = [json.loads(row) for row in eg_fuel_rows]

In [79]:
eg_fuel_df = pd.concat([line_to_df(x) for x in eg_fuel_dict
                        if x['geography'] in state_geos])

Multiply generation values by 1,000,000 and change the units to MMBtu


In [80]:
eg_fuel_df.loc[:,'value'] *= 1E6
eg_fuel_df.loc[:,'units'] = 'mmbtu'

In [81]:
eg_fuel_df.rename_axis({'value':'elec fuel (mmbtu)'}, axis=1, inplace=True)


C:\Users\gschivley\Anaconda2\envs\psci\lib\site-packages\ipykernel_launcher.py:1: FutureWarning: Using 'rename_axis' to alter labels is deprecated. Use '.rename' instead
  """Entry point for launching an IPython kernel.

In [82]:
eg_fuel_df.set_index(['type', 'year', 'month', 'geography'], inplace=True)

In [83]:
#drop
eg_fuel_df.head()


Out[83]:
end f last_updated sector series_id start units elec fuel (mmbtu)
type year month geography
PC 2017 12 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PC-NJ-99.M 201101 mmbtu 22330.0
11 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PC-NJ-99.M 201101 mmbtu 21780.0
10 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PC-NJ-99.M 201101 mmbtu 18450.0
9 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PC-NJ-99.M 201101 mmbtu 22760.0
8 USA-NJ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PC-NJ-99.M 201101 mmbtu 20030.0

I verified on EIA's website that the values below are correct.


In [84]:
#drop
eg_fuel_df.loc[~(eg_fuel_df['elec fuel (mmbtu)'] >= 0) &
                  ~(eg_fuel_df['elec fuel (mmbtu)'].isnull())]


Out[84]:
end f last_updated sector series_id start units elec fuel (mmbtu)
type year month geography
PEL 2002 12 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PEL-MN-99.M 200101 mmbtu -43000.0
11 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PEL-MN-99.M 200101 mmbtu -32000.0
10 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PEL-MN-99.M 200101 mmbtu -15000.0
8 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PEL-MN-99.M 200101 mmbtu -16000.0
7 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PEL-MN-99.M 200101 mmbtu -1000.0
4 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PEL-MN-99.M 200101 mmbtu -6000.0
3 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PEL-MN-99.M 200101 mmbtu -10000.0
2 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PEL-MN-99.M 200101 mmbtu -30000.0
1 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_EG_BTU.PEL-MN-99.M 200101 mmbtu -34000.0

In [85]:
eg_fuel_df.dropna(inplace=True)

Combine three datasets

Need to estimate fuel use for OOG, because EIA doesn't include any (this is only ~2% of OOG fuel for electricity in 2015).


In [86]:
fuel_df = pd.concat([total_fuel_df, eg_fuel_df['elec fuel (mmbtu)']], axis=1)

Not sure how this happens in EIA's data, but we do see the negative fuel consumption for electricity generation.


In [87]:
#drop
fuel_df.loc[~(fuel_df['elec fuel (mmbtu)']>=0)]


Out[87]:
end f last_updated sector series_id start units total fuel (mmbtu) elec fuel (mmbtu)
type year month geography
PEL 2002 1 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MN-99.M 200101 mmbtu 51000.0 -34000.0
2 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MN-99.M 200101 mmbtu 62000.0 -30000.0
3 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MN-99.M 200101 mmbtu 99000.0 -10000.0
4 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MN-99.M 200101 mmbtu 84000.0 -6000.0
7 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MN-99.M 200101 mmbtu 93000.0 -1000.0
8 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MN-99.M 200101 mmbtu 66000.0 -16000.0
10 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MN-99.M 200101 mmbtu 64000.0 -15000.0
11 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MN-99.M 200101 mmbtu 49000.0 -32000.0
12 USA-MN 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.PEL-MN-99.M 200101 mmbtu 50000.0 -43000.0

In [88]:
#drop
fuel_df.loc[~(fuel_df['total fuel (mmbtu)']>=0)]


Out[88]:
end f last_updated sector series_id start units total fuel (mmbtu) elec fuel (mmbtu)
type year month geography

Add CO2 emissions

The difficulty here is that EIA combines all types of coal fuel consumption together in the bulk download and API. Fortunately the emission factors for different coal types aren't too far off on an energy basis (BIT is 93.3 kg/mmbtu, SUB is 97.2 kg/mmbtu). I'm going to average the BIT and SUB factors rather than trying to do something more complicated. In 2015 BIT represented 45% of coal energy for electricity and SUB represented 48%.

Same issue with petroleum liquids. Using the average of DFO and RFO, which were the two largest share of petroleum liquids.


In [89]:
path = join(data_path, 'Final emission factors.csv')
ef = pd.read_csv(path, index_col=0)

Match general types with specific fuel codes

Fuel codes:

  • WWW, wood and wood derived fuels
  • WND, wind
  • STH, solar thermal
  • WAS, other biomass
  • TSN, all solar
  • SUN, utility-scale solar
  • NUC, nuclear
  • NG, natural gas
  • PEL, petroleum liquids
  • SPV, utility-scale solar photovoltaic
  • PC, petroluem coke
  • OTH, other
  • COW, coal,
  • DPV, distributed photovoltaic
  • OOG, other gases
  • HPS, hydro pumped storage
  • HYC, conventional hydroelectric
  • GEO, geothermal
  • AOR, other renewables (total)

In [90]:
fuel_factors = pd.Series({'NG' : ef.loc['NG', 'Fossil Factor'],
                   'PEL': ef.loc[['DFO', 'RFO'], 'Fossil Factor'].mean(),
                   'PC' : ef.loc['PC', 'Fossil Factor'], 
                   'COW' : ef.loc[['BIT', 'SUB'], 'Fossil Factor'].mean(),
                   'OOG' : ef.loc['OG', 'Fossil Factor']}, name='type')

In [91]:
#drop
fuel_factors


Out[91]:
COW     95.250
NG      53.070
OOG     59.000
PC     102.100
PEL     75.975
Name: type, dtype: float64

In [92]:
fuel_df['all fuel CO2 (kg)'] = (fuel_df['total fuel (mmbtu)']
                                .multiply(fuel_factors, level='type',
                                          fill_value=0))
fuel_df['elec fuel CO2 (kg)'] = (fuel_df['elec fuel (mmbtu)']
                                .multiply(fuel_factors, level='type',
                                          fill_value=0))

In [93]:
fuel_df.head()


Out[93]:
end f last_updated sector series_id start units total fuel (mmbtu) elec fuel (mmbtu) all fuel CO2 (kg) elec fuel CO2 (kg)
type year month geography
COW 2001 1 USA-AK 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-AK-99.M 200101 mmbtu 1120000.0 872000.0 1.066800e+08 8.305800e+07
USA-AL 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-AL-99.M 200101 mmbtu 67999000.0 66582000.0 6.476905e+09 6.341935e+09
USA-AR 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-AR-99.M 200101 mmbtu 23099000.0 22700000.0 2.200180e+09 2.162175e+09
USA-AZ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-AZ-99.M 200101 mmbtu 35873000.0 35483000.0 3.416903e+09 3.379756e+09
USA-CA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.CONS_TOT_BTU.COW-CA-99.M 200101 mmbtu 3652000.0 2008000.0 3.478530e+08 1.912620e+08

In [94]:
fuel_cols = ['total fuel (mmbtu)', 'elec fuel (mmbtu)',
              'all fuel CO2 (kg)', 'elec fuel CO2 (kg)']
gen_fuel_df = pd.concat([gen_df, fuel_df[fuel_cols]], axis=1)

Add datetime and quarter columns


In [95]:
add_quarter(gen_fuel_df)

In [96]:
gen_fuel_df.head()


Out[96]:
end f last_updated sector series_id start units generation (MWh) total fuel (mmbtu) elec fuel (mmbtu) all fuel CO2 (kg) elec fuel CO2 (kg) datetime quarter
type year month geography
AOR 2001 1 USA-AK 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AK-99.M 200101 megawatthours 87.00 NaN NaN NaN NaN 2001-01-01 1
USA-AL 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AL-99.M 200101 megawatthours 401167.59 NaN NaN NaN NaN 2001-01-01 1
USA-AR 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AR-99.M 200101 megawatthours 136530.37 NaN NaN NaN NaN 2001-01-01 1
USA-AZ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-AZ-99.M 200101 megawatthours 453.00 NaN NaN NaN NaN 2001-01-01 1
USA-CA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.GEN.AOR-CA-99.M 200101 megawatthours 1717398.41 NaN NaN NaN NaN 2001-01-01 1

No records with positive fuel use but no generation


In [97]:
gen_fuel_df['generation (MWh)'].fillna(value=0, inplace=True)

In [98]:
gen_fuel_df.loc['COW',:].head()


Out[98]:
end f last_updated sector series_id start units generation (MWh) total fuel (mmbtu) elec fuel (mmbtu) all fuel CO2 (kg) elec fuel CO2 (kg) datetime quarter
year month geography
2001 1 USA-AK 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.GEN.COW-AK-99.M 200101 megawatthours 46903.0 1120000.0 872000.0 1.066800e+08 8.305800e+07 2001-01-01 1
USA-AL 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.GEN.COW-AL-99.M 200101 megawatthours 6557913.0 67999000.0 66582000.0 6.476905e+09 6.341935e+09 2001-01-01 1
USA-AR 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.GEN.COW-AR-99.M 200101 megawatthours 2149808.0 23099000.0 22700000.0 2.200180e+09 2.162175e+09 2001-01-01 1
USA-AZ 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.GEN.COW-AZ-99.M 200101 megawatthours 3418454.0 35873000.0 35483000.0 3.416903e+09 3.379756e+09 2001-01-01 1
USA-CA 201712 M 2018-02-28T02:03:13-05:00 99 ELEC.GEN.COW-CA-99.M 200101 megawatthours 199857.0 3652000.0 2008000.0 3.478530e+08 1.912620e+08 2001-01-01 1

Export data

State-level


In [99]:
path = join(data_path, 'EIA state-level gen fuel CO2 2018-03-06.csv')
gen_fuel_df.to_csv(path)

National totals


In [100]:
nat_gen_fuel = gen_fuel_df.groupby(['type', 'year', 'month']).sum()
add_quarter(nat_gen_fuel)

In [101]:
nat_gen_fuel.tail()


Out[101]:
generation (MWh) total fuel (mmbtu) elec fuel (mmbtu) all fuel CO2 (kg) elec fuel CO2 (kg) quarter datetime
type year month
WWW 2017 8 3879727.60 NaN NaN NaN NaN 3 2017-08-01
9 3357157.33 NaN NaN NaN NaN 3 2017-09-01
10 3558432.79 NaN NaN NaN NaN 4 2017-10-01
11 3512302.55 NaN NaN NaN NaN 4 2017-11-01
12 3831624.38 NaN NaN NaN NaN 4 2017-12-01

In [102]:
path = join(data_path, 'Derived data',
            'EIA country-wide gen fuel CO2 {}.csv'.format(file_date))
gen_fuel_df.to_csv(path)

In [ ]: